NOVEL COMPOUNDS

This invention relates to a novel human gene (ZGGBP1) associated with affective
neurological disorders such as bipolar affective disorder. The invention also relates to
homologues of the ZGGBP1 gene in species such as rat and mouse useful in providing animal
models of affective disorders. The invention further relates to both the cDNA and the
structural gene and to fragments encoding functional domains within the gene. This invention
also relates to means for producing the protein encoded by the gene and to means for
regulating its production and activity in vivo.

Affective disorders comprise a broad and heterogeneous category of psychiatric illness
with a prevalence of up to 20% in the population. The most severe of these disorders is
bipolar type I, which affects approximately 1% of the population, a rate that is fairly
consistent across countries. The disease affects young adults, with a mean age of onset of 22
years. Treatment depends upon the phase of the disease, and pharmacological agents include
lithium carbonate, carbamazepine or valproic acid, and tricyclic antidepressants. Monoamine
oxidase inhibitors and selective serotonin re-uptake inhibitors are now also being used. The
success rate of individual drugs is variable and some patients are treated with a combination
of agents, although most have some unwanted side-effects. At present the precise diagnosis of
individual affective disorders is difficult and new, gene-based diagnostic methods are
desirable.

Family, twin and adoption studies have suggested the importance of genetic
predisposition to bipolar affective disorder. On this basis, several groups have undertaken
genetic linkage analysis in families with a high incidence of the disorder to find a causal gene.
Many of the studies show conflicting data, suggesting that a single gene is unlikely to be the
cause. Rather, multiple interacting genetic traits may be involved. A recent study (Stine et al.
1995) identified two regions on chromosome 18 showing linkage to the disease.

The present invention is based on our discovery of a novel gene which maps to
18q21 and which unexpectedly shows appreciable sequence homology to the ned-4 gene on
chromosome 15. Ned-4 is the human homologue of the mouse nedd-4 gene, which is known
to be differentially expressed during neural development, and to be involved in signal
transduction. Human ned-4 has been shown (Schild et al. 1996, Straub et al. 1996) to be a
negative regulator of a sodium channel which is deleted in Liddle's syndrome (a hereditary
form of hypertension).

Nedd-4 was originally isolated as a partial cDNA clone from a mouse brain
library (Kumar et al. 1992) as one of a set of genes which were differentially expressed during
development (Neural precursor cells expressed developmentally down-regulated). The
derived amino acid sequence contains three copies of the WW domain (Andre & Springael
1994; Bork & Sudol 1994; Hofmann & Boucher 1995), a calcium lipid binding (CaLB/C2)
domain (Brose et al. 1995) and a Hect (homologous to the E6-AP carboxyl terminus) domain
which has homology to a ubiquitin ligase (E3) enzyme (Huibregtse et al. 1995). The human
homologue of nedd-4 (Ned-4) was isolated as a randomly cloned EST (KIAA0093) from
immature myeloblast mRNA (Nomura et al. 1994) and shown by sequence comparison to
have 86% identity at the amino acid level to the mouse sequence. The human sequence,
however, has a fourth copy of the WW domain.

The WW domain is a 40 amino acid sequence found in several unrelated proteins. The
two highly conserved tryptophans give it its name. The domain is thought to be involved
in protein-protein interactions. Despite their functional diversity, these proteins
all appear to be involved in cell signalling or regulation. It has been shown that the WW
domains of Nedd-4 interact with the proline-rich PY motifs in the epithelial sodium channel in
the kidney (Schild et al. 1996). Mutational deletion of the PY motifs in the epithelial
sodium channel in Liddle's syndrome, an inherited disease causing systemic hypertension
characterised by hyperactivity of the sodium channel, has been shown to abrogate binding of
Nedd-4 (Straub et al. 1996). It is therefore likely that Nedd-4 has a negative regulatory role
when bound to the channel.

The Hect domain is an E3 ubiquitin-protein ligase domain and enzymes with this
domain catalyse polyubiquitination, which is involved in several cellular processes including
proteolytic degradation, processing and protein trafficking.

The CaLB/C2 domain is thought to be involved in calcium-dependent phospholipid
binding, although some proteins containing this domain do not bind calcium and other
putative functions for the C2 domain such as binding to inositol-1,3,4,5-tetraphosphate have
been suggested. Examples of proteins containing this domain are Protein Kinase C (PKC)
isoenzymes and synaptotagmins.

Therefore in a first aspect of the present invention we provide the ZGGBP1 gene
having the cDNA as set out in Figure 1, and fragments thereof. By fragments we mean
contiguous regions of the gene including complementary DNA and RNA sequences, starting
with short sequences useful as probes or primers of say about 8-50 bases, such as 10-30 bases
or 15-35 bases, to longer sequences of up to 50, 100, 200, 500 or 1000 bases. Indeed any
convenient fragment of the gene of say up to 2kb, 3kb, 4kb or more than 4kb may be a useful
gene fragment for further research, therapeutic or diagnostic purposes. Further convenient
fragments include those whose termini are defined by restriction sites within the gene of one
or more kinds, such as any combination of RsaI, AluI and HinfI.
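
By way of illustration only, the restriction-defined fragments mentioned above can be sketched in Python. This is a minimal sketch, not a procedure taken from this specification: the function names and the toy sequence are hypothetical, the recognition sequences used (RsaI GTAC, AluI AGCT, HinfI GANTC) are the standard published sites, and for simplicity the sequence is split at the start of each site rather than at the exact cut position within it.

```python
import re

# Standard recognition sequences for the three enzymes named in the text
# (N = any base). The dictionary name is an assumption for this sketch.
SITES = {
    "RsaI": "GTAC",
    "AluI": "AGCT",
    "HinfI": "GANTC",
}


def find_sites(seq, enzymes=SITES):
    """Return a sorted list of (position, enzyme) for every recognition site.

    Positions are 0-based indices of the first base of the site.
    """
    seq = seq.upper()
    hits = []
    for name, site in enzymes.items():
        pattern = site.replace("N", "[ACGT]")
        # A lookahead is used so that overlapping sites are all reported.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), name))
    return sorted(hits)


def fragments_between_sites(seq, enzymes=SITES):
    """Split seq at the start of each recognition site.

    Illustrative simplification: real enzymes cut at a defined offset
    inside the site, which this sketch does not model.
    """
    cuts = sorted({pos for pos, _ in find_sites(seq, enzymes)})
    bounds = [0] + cuts + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


toy = "AAGTACCCAGCTTTGATTCGG"  # hypothetical sequence, not taken from Figure 1
print(find_sites(toy))
print(fragments_between_sites(toy))
```

The same functions could be applied to any convenient part of the sequence of Figure 1 to list candidate fragment boundaries.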

In a further aspect of the invention we provide homologues of the ZGGBP1 gene in
species such as rat and mouse useful in providing animal models of affective disorders. The
full sequences of the individual homologues may be determined using conventional
techniques such as hybridisation, PCR and sequencing techniques, starting with any
convenient part of the sequence set out in Figure 1. The partial sequence of the mouse gene is
set out in Figure 4 and this gene and the protein encoded by this gene represent further
independent aspects of the invention.

In a further aspect of the invention we provide a recombinant ZGGBP1 protein
obtained by expression of all or a part of the cDNA as set out in Figure 1. The recombinant
protein may comprise all or a convenient part of the peptide sequence set out in Figure 1. The
production of a protein according to the invention may be achieved using standard
recombinant DNA techniques involving the expression of the protein by a host cell as
described for example by Sambrook et al. 1989. The isolated nucleic acids described herein
may for example be introduced into any convenient expression vector, for example the T7
Studier system for expression in E. coli (US-A-4952496), Pichia pastoris for expression in
yeast, the Baculovirus system for expression in insect cells and the GS system for expression
in mammalian cells, by operatively linking the DNA to any necessary expression control
elements therein and transforming any suitable procaryotic or eukaryotic host cell with the
vector using well known procedures.

Therefore in a further aspect of the invention we provide a recombinant plasmid
comprising all or a part of the ZGGBP1 cDNA of the invention.

The invention further extends to cells containing said recombinant plasmids and to a 
process for producing a ZGGBP1 protein of the invention which comprises culturing said 
cells such that the desired protein is expressed and recovering the protein from the culture. 

By way of example, the nucleotide sequence in Figure 1 is inserted downstream of the
SV40 promoter in the pGEX plasmid vector, and either transiently or stably expressed in
COS-7 cells. Expression of the protein according to the invention can be detected following
disruption of the cells by Western blotting.

It may be desirable to produce the individual functional domains of the protein
according to the invention in isolation from the rest of the molecule. This may be achieved
using the above standard recombinant DNA techniques except that in this instance the DNA
sequence used is that encoding one of the partial amino acid sequences of the domains
identified in Figure 1 or a combination of these.

By way of further example, the nucleotide sequence in Figure 1 is inserted
downstream of the SV40 promoter and the glutathione-S-transferase (GST) coding sequence
in the pBC plasmid vector, and either transiently or stably expressed in COS-7 cells allowing
expression of the corresponding fusion protein. Expression of the fusion protein can be
detected following disruption of the cells by Western blotting with antibodies to GST, and
furthermore the fusion protein can be used in an affinity binding procedure to find proteins
which are functional partners of the protein of the invention from cell extracts.

A ZGGBP1 protein of the invention may in particular be used to screen for
compounds which regulate the activity of the enzymes and the invention extends to such a
screen and to the use of compounds obtainable therefrom to regulate the activity of the protein
in vivo.

Thus according to a further aspect of the invention we provide a method for
identifying a compound capable of modulating the action of a ZGGBP1 protein which method
comprises subjecting one or more test compounds to a screen comprising (A) a protein
containing the amino acid sequence shown in Figure 1 or a homologue or fragment thereof
containing a nucleotide sequence in Figure 1 or a homologue or fragment thereof.

The screen according to the invention may be operated using conventional procedures,
for example by bringing the test compound or compounds to be screened and an appropriate
substrate into contact with the protein or a cell capable of producing it and determining
affinity for the protein in accordance with conventional procedures.

Any compound identified in this way may be used in the treatment of humans and/or
other animals of one or more of the above mentioned diseases. The invention thus extends to
a compound selected through its ability to regulate the activity of the protein in vivo as
primarily determined in a screening assay utilising the protein containing an amino acid
sequence shown in Figure 1 or a homologue or fragment thereof, or a gene coding therefor, for
use in the treatment of a disease in which the over- or under-activity or unregulated activity of
the protein is implicated.

The ZGGBP1 gene of the invention may also be used as the basis for diagnosis, for
example to determine expression levels in a human subject, by for example direct DNA
sequence comparison or DNA/RNA hybridisation assays. Diagnostic assays may involve the
use of nucleic acid amplification technology such as the PCR and in particular the
Amplification Refractory Mutation System (ARMS) as claimed in our European Patent
No. 0 332 435. Such assays may be used to determine allelic variants of the gene, for example
insertions, deletions and/or mutations such as one or more point mutations. Such variants
may be heterozygous or homozygous.

The ZGGBP1 gene may also be used in gene therapy, for example where it is desired
to modify the production of the protein in vivo, and the invention extends to such uses.

Knowledge of the gene according to the invention also provides the ability to regulate
its expression in vivo by for example the use of antisense DNA or RNA. Thus, according to a
further aspect of the invention we provide an antisense DNA or an antisense RNA of a gene
for cooperation with the nucleotide sequence shown in Figure 1.

The antisense DNA or RNA for cooperation with the gene in Figure 1 can be produced
using conventional means, by standard molecular biology and/or by chemical synthesis as
described above. If desired, the antisense DNA or antisense RNA may be chemically modified
so as to prevent degradation in vivo or to facilitate passage through a cell membrane and/or a
substance capable of inactivating mRNA, for example a ribozyme, may be linked thereto and
the invention extends to such constructs.
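
As an illustration only of what an antisense sequence is at the level of bases, the following minimal Python sketch derives an antisense RNA from a stretch of sense-strand DNA. The function name is hypothetical and the sketch says nothing about which region of the gene would make an effective antisense agent; the nine bases used are the codons ATG GAG CGA that appear in Figure 1.

```python
# Map each sense-strand DNA base to the RNA base that pairs with it.
RNA_COMPLEMENT = {"A": "U", "C": "G", "G": "C", "T": "A"}


def antisense_rna(sense_dna):
    """Return the antisense RNA, written 5'->3', for a sense-strand DNA.

    The sense strand is read backwards and each base is replaced by its
    RNA complement, so the result pairs antiparallel with the mRNA.
    """
    return "".join(RNA_COMPLEMENT[base] for base in reversed(sense_dna.upper()))


# Codons ATG GAG CGA (Met Glu Arg) as they appear in Figure 1.
print(antisense_rna("ATGGAGCGA"))  # UCGCUCCAU
```

An antisense DNA would be obtained in the same way with T in place of U.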

The antisense DNA or antisense RNA may be of use in the treatment of diseases or
disorders in humans in which the over- or underregulated production of the gene product has
been implicated. Such diseases or disorders may include those described under the general
headings of neurologic (e.g. stroke, dementia), renal (e.g. hypertension, nephrosis) and
cardiovascular disorders.

Convenient DNA sequences may be obtained using conventional molecular biology
procedures, for example by probing a human genomic or cDNA library with one or more
labelled oligonucleotide probes containing 10 or more contiguous nucleotides designed using
the nucleotide sequences described here. Alternatively, pairs of oligonucleotides, one of
which is homologous to the sense strand and one to the antisense strand, designed using the
nucleotide sequences described here to flank a specific region of DNA, may be used to amplify
that DNA from a cDNA library.
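
The primer-pair arrangement described above (one oligonucleotide matching the sense strand, one matching the antisense strand, together flanking a region) can be sketched as follows. This is a minimal sketch under stated assumptions: the function names and the default length are hypothetical, and no account is taken of melting temperature, secondary structure or specificity, all of which matter in practice. The example template is the first thirty bases of the row numbered 181 in Figure 1.

```python
DNA_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(dna):
    """Return the reverse complement of a DNA sequence, written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(dna.upper()))


def flanking_primers(template, start, end, length=10):
    """Pick a primer pair flanking template[start:end] (sense strand, 0-based).

    The forward primer is identical to the first `length` bases of the
    region; the reverse primer is the reverse complement of the last
    `length` bases, so the two primers point towards each other.
    """
    region = template[start:end].upper()
    if len(region) < 2 * length:
        raise ValueError("region is too short for two non-overlapping primers")
    return region[:length], reverse_complement(region[-length:])


# First ten codons of the row numbered 181 in Figure 1.
template = "GTGAAACTTTCATTGTACGTAGCGGATGAG"
forward, reverse = flanking_primers(template, 0, len(template), length=10)
print(forward)  # GTGAAACTTT
print(reverse)  # CTCATCCGCT
```

The default of 10 bases simply mirrors the minimum probe length mentioned in the text; longer primers would normally be chosen for amplification.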

The ZGGBP1 protein of the invention and homologues or fragments thereof may be
used to generate substances which selectively bind to it and in so doing regulate the activity of
the enzymes. Such substances include, for example, antibodies, and the invention extends in
particular to an antibody which is capable of recognising one or more epitopes containing the
protein binding domains shown in Figure 1. In particular the antibody may be a neutralising
antibody.

As used herein the term antibody is to be understood to mean a whole antibody or a
fragment thereof, for example a F(ab')2, Fab, Fv, VH or VK fragment, a single chain
antibody, a multimeric monospecific antibody or fragment thereof, or a bi- or multi-specific
antibody or fragment thereof.

The invention will now be illustrated but not limited by reference to the following 
detailed description, References, Examples and Figures wherein: 


Figure 1 shows the sequence of the ZGGBP1 cDNA and predicted amino acid translation. 
The C2 domain is indicated by carets, the four WW domains are indicated by asterisks and the 
Hect domain is indicated by underlining. 

Figure 2 shows a comparison of amino acid sequences of human ned4 Swissprot entry
P46934 and ZGGBP1.

Figure 3 shows a Northern blot analysis of various human tissues probed with ZGGBP1.

Figure 4 shows a comparison of the nucleic acid sequences of human and mouse ZGGBP1.
The mouse sequence is a partial cDNA which spans the C-terminal portion of the human
protein coding region.

Example 1

Identification of ZGGBP1 

We used two methods for investigating the 18q21 region of interest. In one method
we used positional cloning to identify novel transcripts from physical clones representing the
region and in a second method we utilised public databases to identify transcripts which had
been assigned to a low resolution map of the region by radiation hybrid mapping and assigned
them to physical clones representing a high resolution map of the region.

Method 1 - Positional Cloning

The 18q21 region described by Stine et al. (1995) is delimited by the STS markers
used by that group to identify linkage. They found the most strongly linked marker to be
D18S41, which had a LOD score of 3.51 in cases of paternal inheritance. Linkage declined
over flanking markers. We identified a set of four Yeast Artificial Chromosomes (YACs)
which comprised a contiguous overlapping set of genomic clones covering the defined region
by the presence in those YACs of STS markers used in the Stine study.

DNA from the YACs was prepared and used in a PCR-based hybridisation approach
to enrich for transcripts from a human fetal brain cDNA library. This approach, known as
direct selection (Lovett et al. 1991), has been shown to be efficient in identifying transcripts
present on large genomic clones.

Method 2 - Refining Radiation Hybrid Mapped Transcripts

The UNIGENE database is a repository for transcripts which have been mapped by
taking representative Expressed Sequence Tagged Sites (ESTs) and performing PCR analysis
on a panel of radiation hybrids which have been calibrated with respect to a framework of
1000 genetic markers (Schuler et al. 1996). We found 36 EST clusters which had been
mapped to a radiation hybrid map interval which corresponded to the 18q21 region of interest
and to flanking regions outside.

All the ESTs were tested by PCR on our YAC genomic clones to determine which 
were present. We found approximately half of the ESTs to be present within the genomic 
clones and were able to order them based on their position within the YAC contig. 

Results

Several clones from our direct selection experiments showed sequence homology to a
known EST which we had previously shown to be present in two of the YACs within the
contig. The EST was representative of a cluster of sequences. All of these sequences were
assembled together using DNAStar Seqman and the consensus sequences obtained were used
iteratively to search for other database members within the Unigene, dbEST and EMBL
databases. This resulted in the surprising identification of two further clusters of ESTs which
had previously not been related to each other on the basis of sequence analysis. The two new
EST clusters were annotated as having sequence similarity to ned-4. This was an unexpected
finding since we had recently mapped the human ned-4 by Fluorescence In Situ
Hybridisation (FISH) to chromosome 15. We were aware that ned-4 was involved in
neuronal cell signalling and we concluded that the EST cluster on 18q21 must represent a
closely related gene and was therefore likely to be involved in affective neurological disorders
such as bipolar affective disorder.

The assembly of the EST clusters did not give rise to a single complete contiguous
sequence. The reason for this is that many of the EST sequences were derived from IMAGE
cDNA clones for which end sequence only was available. In order to fill in the gaps, four of
these clones (IMAGE I.D. 80951, 33059, 79526 and 79984) were sequenced completely to
give a complete contiguous sequence.
Comparison of the sequence with ned-4 showed that the contig comprised 2kb of
3' Untranslated Region (UTR) and 700bp of the coding region of a gene which had
approximately 85% identity at the amino acid level to ned-4 and which we named ZGGBP1.

Isolation of the full length gene for ZGGBP1

The extending of partial transcripts to full length clones can be a complex and difficult
process requiring skill and expertise for success. Having considered several possibilities, we
opted for a PCR-based approach to isolate and characterise the full length ZGGBP1 gene.
Human foetal brain double stranded cDNA was synthesised from mRNA using standard
methods (Sambrook et al. 1989) and ligated into lambda Zap vector by use of adapters.
However, in order to minimise the loss of transcripts often seen following the cloning step,
the resulting ligation mix was not cloned but was instead used as a template for PCR.
Oligonucleotide primers specific to ZGGBP1 were used in combination with vector specific
primers to amplify DNA across the unknown part of the gene. Since the distance to be
covered was unknown, we performed long PCR using the commercially available BCL
Expand enzyme and long (30mer) oligonucleotide primers. Since we were using
unamplified material, where our target cDNAs were likely to be present only in very small
amounts, we utilised a secondary PCR step with nested oligonucleotide primers, again
using long PCR, to yield sufficient PCR products to be visible by gel analysis and also to
minimise the possibility of non-specific PCR amplification. The PCR products derived from
these experiments were then purified and sequenced directly. Where necessary, the DNA
sequence obtained was used to design further primers to walk along the gene in a 3' to 5'
direction. The complete nucleotide sequence derived from this work is 4.8kb and the
translated amino acid sequence is shown in Figure 1.
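
For readers who wish to check the predicted translation against the nucleotide sequence, the step from cDNA to amino acids can be sketched in Python as below. This is a minimal sketch assuming the standard genetic code and a reading frame that starts at the first base supplied; the names are hypothetical, and codons containing an ambiguous base (such as the TYT / Xxx entry in Figure 1) are reported as X.

```python
from itertools import product

# Standard genetic code, built in the conventional T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"  # first base T
    "LLLLPPPPHHQQRRRR"  # first base C
    "IIIMTTTTNNKKSSRR"  # first base A
    "VVVVAAAADDEEGGGG"  # first base G
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cdna):
    """Translate a coding sequence in frame 0 into one-letter amino acids.

    Translation stops at the first stop codon; a trailing partial codon
    is ignored; codons not in the table (ambiguous bases) become 'X'.
    """
    cdna = cdna.upper()
    protein = []
    for i in range(0, len(cdna) - 2, 3):
        aa = CODON_TABLE.get(cdna[i:i + 3], "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


# First ten codons of the row numbered 181 in Figure 1.
print(translate("GTGAAACTTTCATTGTACGTAGCGGATGAG"))  # VKLSLYVADE
```

The output corresponds to Val Lys Leu Ser Leu Tyr Val Ala Asp Glu in the three-letter notation used in Figure 1.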




The amino acid sequence derived from the cDNA was compared with that of ned-4 
and is shown in Figure 2. The proteins diverge markedly towards the N-terminal portion of 
the protein, although there is conservation of the common functional motifs. 

Northern analysis using a probe derived from the 3'UTR of ZGGBP1 showed a band
at approximately 4.8kb but also a more abundant band of 9kb in size in several neurological
tissues, with the exception of medulla or spinal cord. These bands are likely to be due to
alternative splicing (Figure 3). Other tissues contained the 4.8kb band at higher abundance
with respect to the 9kb band and also a 4kb band. ZGGBP1 was expressed in all tissues
examined with the exception of liver where we could not detect a transcript at our current
detection sensitivity.

Comparison of Amino Acid Sequences of human ned-4 and ZGGBP1 

A comparison of the amino acid sequences of human ned-4 and ZGGBP1 is shown in
Figure 2. The two proteins have a high level of homology over much of the C-terminal
region, including the Hect and WW domains, but diverge over the central portion of the
protein. There is a further block of homology near to the N-terminal region, including the C2
domain. The presence of these domains in ZGGBP1 suggests some common functionality
with ned-4.
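
Percentage identity figures of the kind quoted in this specification can be computed from a pairwise alignment in more than one way. The following minimal Python sketch shows one common convention, assuming the two sequences have already been aligned to equal length with "-" marking gaps, and counting only the columns in which neither sequence has a gap. The function name and the second example sequence are hypothetical; the specification does not state which convention was used for its own figures.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of identical residues over the gap-free alignment columns.

    Both inputs must be the same length (already aligned); '-' is a gap.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns containing a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# First sequence: residues from Figure 1; second: a made-up comparison sequence.
print(round(percent_identity("VKLSLYVADE", "VKLSMYV-DE"), 1))  # 88.9
```

Other conventions, for example dividing by the full alignment length including gaps, give lower values for the same alignment.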

Isolation of Genomic Clone for ZGGBP1

The Research Genetics human Bacterial Artificial Chromosome (BAC) library (Shizuya
et al. 1992, Kim et al. 1996) was screened by PCR using primers specific to the 3'UTR of
ZGGBP1 and BACs were isolated. These will be used to characterise the structural gene
including the intron/exon structure and the 5' regulatory region.

Isolation of Mouse homologue for ZGGBP1

The full length sequence of ZGGBP1 shown in Figure 1 was used to search the dbEST
database to identify homologous mouse sequences. Three overlapping IMAGE clones were
identified (IMAGE I.D. 479436, 573510, 482922) comprising a partial transcript. Comparison
of the mouse and human nucleotide sequence is shown in Figure 4. The mouse clones were
isolated for use as a probe for in situ hybridisation on sections of mouse brain during
development, and as a probe of mouse genomic libraries to isolate genomic clones and to
produce transgenic mice by gene targeting using homologous recombination.
FIGURE 1



Sequence of ZGGBP1 cDNA and predicted amino acid translation

     AAG GAC ATC TTT GGA GCC AGT GAT CCG TAT
     Lys Asp Ile Phe Gly Ala Ser Asp Pro Tyr

 181 GTG AAA CTT TCA TTG TAC GTA GCG GAT GAG AAT AGA GAA CTT GCT TTG GTC CAG ACA AAA
  61 Val Lys Leu Ser Leu Tyr Val Ala Asp Glu Asn Arg Glu Leu Ala Leu Val Gln Thr Lys

 241 ACA ATT AAA AAG ACA CTG AAC CCA AAA TGG AAT GAA GAA TTT TAT TTC AGG GTA AAC CCA
  81 Thr Ile Lys Lys Thr Leu Asn Pro Lys Trp Asn Glu Glu Phe Tyr Phe Arg Val Asn Pro

 301 TCT AAT CAC AGA CTC CTA TTT GAA GTA TTT GAC GAA AAT AGA CTG ACA CGA GAC GAC TTC
 101 Ser Asn His Arg Leu Leu Phe Glu Val Phe Asp Glu Asn Arg Leu Thr Arg Asp Asp Phe

 361 CTG GGC CAG GTG GAC GTG CCC CTT AGT CAC CTT CCG ACA GAA GAT CCA ACC ATG GAG CGA
 121 Leu Gly Gln Val Asp Val Pro Leu Ser His Leu Pro Thr Glu Asp Pro Thr Met Glu Arg

 421 CCC TAT ACA TTT AAG GAC TTT CTC CTC AGA CCA AGA AGT CAT AAG TCT CGA GTT AAG GGA
 141 Pro Tyr Thr Phe Lys Asp Phe Leu Leu Arg Pro Arg Ser His Lys Ser Arg Val Lys Gly

 481 TTT TTG CGA TTG AAA TGG CCT ATA TGC CAA AAA AAT GGA GGT CAA GAT GAA GAA AAC AGT
 161 Phe Leu Arg Leu Lys Trp Pro Ile Cys Gln Lys Asn Gly Gly Gln Asp Glu Glu Asn Ser

 541 GAC CAG AGG GAT GAC ATG GAG CAT GGA TGG GAA GTT GTT GAC TCA AAT GAC TCG GCT TCT
 181 Asp Gln Arg Asp Asp Met Glu His Gly Trp Glu Val Val Asp Ser Asn Asp Ser Ala Ser

 601 CAG CAC CAA GAG GAA CTT CCT CCT CCT CCT CTG CCT CCC GGG TGG GAA GAA AAA GTG GAC
 201 Gln His Gln Glu Glu Leu Pro Pro Pro Pro Leu Pro Pro Gly Trp Glu Glu Lys Val Asp

 661 AAT TTA GGC CGA ACT TAC TAT GTC AAC CAC AAC AAC CGG ACC ACT CAG TGG CAC AGA CCA
 221 Asn Leu Gly Arg Thr Tyr Tyr Val Asn His Asn Asn Arg Thr Thr Gln Trp His Arg Pro

 721 AGC CTG ATG GAC GTG TCC TCG GAG TCG GAC AAT AAC ATC AGA CAG ATC AAC CAG GAG GCA
 241 Ser Leu Met Asp Val Ser Ser Glu Ser Asp Asn Asn Ile Arg Gln Ile Asn Gln Glu Ala

 781 GCA CAC CGG CGC TTC CGC TCC CGC AGG CAC ATC AGC GAA GAC TTG GAG CCC GAG CCC TCG
 261 Ala His Arg Arg Phe Arg Ser Arg Arg His Ile Ser Glu Asp Leu Glu Pro Glu Pro Ser

 841 GAG GGC GGG GAT GTC CCC GAG CCT TGG GAG ACC ATT TCA GAG GAA GTG AAT ATC GCT GGA
 281 Glu Gly Gly Asp Val Pro Glu Pro Trp Glu Thr Ile Ser Glu Glu Val Asn Ile Ala Gly

 901 GAC TCT CTC GGT CTG GTT TTG CCC CCA CCA CCA GCC TCC CCA GGA TCT CGG ACC AGC CCT
 301 Asp Ser Leu Gly Leu Val Leu Pro Pro Pro Pro Ala Ser Pro Gly Ser Arg Thr Ser Pro

 961 CAG GAG CTG TCA GAG GAA CTA AGC AGA AGG CTT CAG ATC ACT CCA GAC TCC AAT GGG GAA
 321 Gln Glu Leu Ser Glu Glu Leu Ser Arg Arg Leu Gln Ile Thr Pro Asp Ser Asn Gly Glu

1021 CAG TTC AGC TCT TTG ATT CAA AGA GAA CCC TCC TCA AGG TTG AGG TCA TGC AGT GTC ACC
 341 Gln Phe Ser Ser Leu Ile Gln Arg Glu Pro Ser Ser Arg Leu Arg Ser Cys Ser Val Thr

1081 GAC GCA GTT GCA GAA CAG GGC CAT CTA CCA CCG CCA TCA GTG GCC TAT GTA CAT ACC ACG
 361 Asp Ala Val Ala Glu Gln Gly His Leu Pro Pro Pro Ser Val Ala Tyr Val His Thr Thr

1141 CCG GGT CTG CCT TCA GGC TGG GAA GAA AGA AAA GAT GCT AAG GGG CGC ACA TAC TAT GTC
 381 Pro Gly Leu Pro Ser Gly Trp Glu Glu Arg Lys Asp Ala Lys Gly Arg Thr Tyr Tyr Val

1201 AAT CAT AAC AAT CGA ACC ACA ACT TGG ACT CGA CCT ATC ATG CAG CTT GCA GAA GAT GGT
 401 Asn His Asn Asn Arg Thr Thr Thr Trp Thr Arg Pro Ile Met Gln Leu Ala Glu Asp Gly

1261 GCG TCC GGA TCA GCC ACA AAC AGT AAC AAC CAT CTA ATC GAG CCT CAG ATC CGC CGG CCT
 421 Ala Ser Gly Ser Ala Thr Asn Ser Asn Asn His Leu Ile Glu Pro Gln Ile Arg Arg Pro

1321 CGT AGC CTC AGC TCG CCA ACA GTA ACT TTA TYT GCC CCG CTG GAG GGT GCC AAG GAC TCA
 441 Arg Ser Leu Ser Ser Pro Thr Val Thr Leu Xxx Ala Pro Leu Glu Gly Ala Lys Asp Ser

1381 CCC GTA CGT CGG GCT GTG AAA GAC ACC CTT TCC AAC CCA CAG TCC CCA CAG CCA TCA CCT
 461 Pro Val Arg Arg Ala Val Lys Asp Thr Leu Ser Asn Pro Gln Ser Pro Gln Pro Ser Pro

1441 TAC AAC TCC CCC AAA CCA CAA CAC AAA GTC ACA CAG AGC TTC TTG CCA CCC GGC TGG GAA 
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FIGURE ] (continued) 



481 


Tyr 


Asn 


Ser 


Pro 


Lys 


Pro 


Gin 


His 


Lys 


Val 


Thr 


Gin 


Ser 


Phe 


Leu 


Pro 


Pro Gly 


Trp 


Glu 


1 50 1 
501 


ATG 
Met 


TGG 
Trp 


ATA 
I le 


GCG 
Ala 


CCA 
Pro 


AAC 
Asn 


GGC CGG CCC TTC TTC 
Gly Arg Pro Phe Phe 


ATT 
I le 


GAT 
Asp 


CAT 
His 


AAC 
Asn 


ACA 
Thr 


AAG 
Lys 


ACT 
Thr 


ACA 
Thr 


ACC 
Thr 


1 56 1 
521 


TGG 
Trp 


GAA 
Glu 


GAT CCA CGT TTG AAA 
Asp Pro Arg Leu Lys 


TTT 
Phe 


CCA 
Pro 


GTA 
Val 


CAT 
His 


ATG 
Met 


CGG 
Arg 


TCA 
Ser 


AAG 
Lys 


ACA 
Thr 


TTT 
Phe 


TTA 
Leu 


AAC 
Asn 


CCC 
Pro 


162 1 
541 


AAT 

Asn 


GAC 
Asp 


CTT 


GGC CCC CTT CCT 
Gly Pro Leu Pro 


CCT 
Pro 


GGC TGG GAA 
Gly Trp Glu 


GAA 
Giu 


AGA 
Arg 


ATT 
I le 


CAC 
His 


TTG 
Leu 


GAT GGC CGA ACG 
Asp Gly Arg Thr 


168 1 
561 


TTT 
Phe 


TAT 

Tyr 


ATT 
I 1 e 


GAT 
Asp 


CAT 
H 1 s 


AAT 
Asn 


AGC 
Ser 


AAA 
Lys 


ATT 
I 1 e 


ACT 
Thr 


CAG 
Gin 


TGG 
Trp 


GAA 

Glu 


GAC 
Asp 


CCA 
Pro 


AGA 
Arg 


CTG 
Leu 


CAG 
Gin 


AAC 
Asn 


CCA 
Pro 


1 74 1 
581 


GCT 
Ala 


ATT 
I le 


ACT 
Thr 


GGT 
Gly 


CCG 
Pro 


GCT 
A 1 a 


GTC 
Val 


CCT 
Pro 


TAC 
Tyr 


TCC 
Ser 


AGA 
Arg 


GAA 
Glu 


TTT 
Phe 


AAG 

Lys 


CAG 
Gin 


AAA 

Lys 


TAT 
Tyr 


GAC 
Asp 


TAC 
Tyr 


TTC 
Phe 


1801 
601 


AGG 
Arg 


AAG 
Lys 


AAA 
Lys 


TTA 
Leu 


AAG 
Lys 


AAA 
Lys 


CCT 
Pro 


GCT 
Ala 


GAT 
Asp 


ATC 
I le 


CCC 
Pro 


AAT 
Asn 


AGG 
Arg 


TTT 
Phe 


GAA 
Glu 


ATG 
Met 


AAA 
Lys 


CTT 
Leu 


C*C 
His 


AGA 
Arg 


1 861 

621 


AAT 
Asn 


AAC 
Asn 


ATA 
I 1 e 


TTT 
Phe 


GAA 
G 1 u 


GAG 
G 1 u 


TCC 
Ser 


TAT 
Tyr 


CGG 
Arg 


AGA 
Arg 


ATT 
I le 


ATG 
Met 


TCC 
Ser 


GTG 
Val 


AAA 

Lys 


AGA 
Arg 


CCA 
Pro 


GAT 
Asp 


GTC 
Val 


CTA 
Leu 


192 1 
641 


AAA 
Lys 


GCT 
Ala 


AGA 


CTG 
Leu 


TGG 
Trp 


ATT 
I 1 e 


GAG 
G 1 u 


TTT 
Phe 


GAA 
G 1 u 


TCA 
Ser 


GAG 
G 1 u 


AAA 
Lys 


GGT 
Gly 


CTT 
Leu 


GAC 
Asp 


TAT 
Tyr 


GGG 
Gly 


GGT 
Gly 


GTG 
Val 


GCC 
Ala 


198 1 
661 


AGA 
Arg 


GAA 
Glu 


TGG 

Trn 


TTC 
Phe 


TTC 
Phe 


TTA 
Leu 


CTG 
Leu 


TCC 
Ser 


AAA 

Lys 


GAG 
G 1 u 


ATG 
Met 


TTC 
Phe 


AAC 
Asn 


CCC 
Pro 


TAC 
Tyr 


TAC 
Tyr 


GGC 
G 1 y 


CTC 
Leu 


TTT 
Phe 


GAG 
G 1 u 


2041 TAC TCT GCC ACG GAC AAC TAC ACC CTT CAG ATC AAC CCT AAT TCA GGC CTC TGT AAT GAG

681 Tyr Ser Ala Thr Asp Asn Tyr Thr Leu Gln Ile Asn Pro Asn Ser Gly Leu Cys Asn Glu

2101 GAT CAT TTG TCC TAC TTC ACT TTT ATT GGA AGA GTT GCT GGT CTG GCC GTA TTT CAT GGG

701 Asp His Leu Ser Tyr Phe Thr Phe Ile Gly Arg Val Ala Gly Leu Ala Val Phe His Gly

2161 AAG CTC TTA GAT GGT TTC TTC ATT AGA CCA TTT TAC AAG ATG ATG TTG GGA AAG CAG ATA

721 Lys Leu Leu Asp Gly Phe Phe Ile Arg Pro Phe Tyr Lys Met Met Leu Gly Lys Gln Ile

2221 ACC CTG AAT GAC ATG GAA TCT GTG GAT AGT GAA TAT TAC AAC TCT TTG AAA TGG ATC CTG

741 Thr Leu Asn Asp Met Glu Ser Val Asp Ser Glu Tyr Tyr Asn Ser Leu Lys Trp Ile Leu

2281 GAG AAT GAC CCT ACT GAG CTG GAC CTC ATG TTC TGC ATA GAC GAA GAA AAC TTT GGA CAG

761 Glu Asn Asp Pro Thr Glu Leu Asp Leu Met Phe Cys Ile Asp Glu Glu Asn Phe Gly Gln

2341 ACA TAT CAA GTG GAT TTG AAG CCC AAT GGG TCA GAA ATA ATG GTC ACA AAT GAA AAC AAA

781 Thr Tyr Gln Val Asp Leu Lys Pro Asn Gly Ser Glu Ile Met Val Thr Asn Glu Asn Lys

2401 AGG GAA TAT ATC GAC TTA GTC ATC CAG TGG AGA TTT GTG AAC AGG GTC CAG AAG CAG ATG

801 Arg Glu Tyr Ile Asp Leu Val Ile Gln Trp Arg Phe Val Asn Arg Val Gln Lys Gln Met

2461 AAC GCC TTC TTG GAG GGA TTC ACA GAA CTA CTT CCT ATT GAT TTG ATT AAA ATT TTT GAT

821 Asn Ala Phe Leu Glu Gly Phe Thr Glu Leu Leu Pro Ile Asp Leu Ile Lys Ile Phe Asp

2521 GAA AAT GAG CTG GAG TTG CTC ATG TGC GGC CTC GGT GAT GTG GAT GTG AAT GAC TGG AGA

841 Glu Asn Glu Leu Glu Leu Leu Met Cys Gly Leu Gly Asp Val Asp Val Asn Asp Trp Arg

2581 CAG CAT [...] ATT TAC AAG AAC GGC TAC TGC CCA AAC CAC CCC GTC ATT CAG TGG TTC TGG

861 Gln His Ser Ile Tyr Lys Asn Gly Tyr Cys Pro Asn His Pro Val Ile Gln Trp Phe Trp

2641 AAG GCT GTG CTA CTC ATG GAC GCC GAA AAG CGT ATC CGG TTA CTG CAG TTT GTC ACA GGG

881 Lys Ala Val Leu Leu Met Asp Ala Glu Lys Arg Ile Arg Leu Leu Gln Phe Val Thr Gly

2701 ACA TCG CGA GTA CCT ATG AAT GGA TTT GCC GAA CTT TAT GGT TCC AAT GGT CCT CAG CTG

901 Thr Ser Arg Val Pro Met Asn Gly Phe Ala Glu Leu Tyr Gly Ser Asn Gly Pro Gln Leu

2761 TTT ACA ATA GAG CAA TGG GGC AGT CCT GAG AAA CTG CCC AGA GCT CAC ACA TGC TTT AAT

921 Phe Thr Ile Glu Gln Trp Gly Ser Pro Glu Lys Leu Pro Arg Ala His Thr Cys Phe Asn

2821 CCC CTT GAC TTA CCT CCA TAT GAA ACC TTT GAA GAT TTA CGA GAG AAA CTT CTC ATG GCC

941 Arg Leu Asp Leu Pro Pro Tyr Glu Thr Phe Glu Asp Leu Arg Glu Lys Leu Leu Met Ala

2881 GTG GAA AAT GCT CAA GGA TTT GAA GGG GTG GAT TAA gca ccc tgt gct tcg ggg gtg gtt

961 Val Glu Asn Ala Gln Gly Phe Glu Gly Val Asp

2941 gtt ctt coa gca agt tct gct tgc act ttt gca ttt gcc too cog act ttt gca gag gcg

3001 atq oca qaq aqc aqc toe aaa cat aat ccc tnn nor rnn nr-r- ft/- <-,<-,- <~^^ + ^ + 



FIGURE 1 (continued)

3061 Q ° 9 ttC 999 Qtg C " 9 ° a CCt 9gt ccc ° 9C tt9 a ^ t tcc t 9 c ctt tcc cac cac aaa tta

3121 tca act ggt tga tgt gta cac taa tta cat ttc agg agg act taa tgc tat tta tgt tgt

3181 cct ctg cag gca aag ccc tta ata oat att tta cat cct ttc taa tga caa tga atg gaa

3241 tta atc act caa cag gta tag tat tac gac tca tgt tta ctt ttt aaa atg att tag acc

3301 gat ttt cag att tta ttt cgt tat gat taa aga tgt ctc atg tac ttg gaa aag tga gca

3361 ttt ttt ttt ttt ttt gta ttt cac ttt cat acc agg ctt aat gtc oat gac att ttt att

3421 ttt gaa gta ctc tga cac ctc cac cct cta ctt tat tag aat tgg aag gca aat ttt tgt

3481 cca aaa acc tac aga caa gta ctt tga gag aat ttc caa tat aat att aga cat aat gat

3541 aat ttt ttc cat act cag aat gaa aaa ctg gat att acg ttt ttg ttt tgg ggt ttt ttt 

3601 gta caa att tag cta ata gct aca ggc tga gag aat tgt aac ata gca tga caa att ttg

3661 tgt tga ctt gaa agg aat cac acc att att cct tag aag taa tta cat gtg ttc taa cac 

3721 att tga gac ogg gtt gga ctc cca ttt ctc atc cga gaa att act taa ccc ttc ctg ggc

3781 gct gta cag tca tct ttt att cta ttt cct ctt tgc tgt ttg tag tag oga cat ttt gaa

3841 tga aac ttg gca ctg ctt gat tca aaa ctg tgg aaa cca gat ctg ttt ogt ctc ctg ttt

3901 gta tgc gtt tgc taa tgg tag cta aat aac cag ttt ttg ttg taa atg cac caa ttc tga

3961 agg cac ttt atg tac tac atg gag gtc ata tct ggt ttt gtt ttt att ttt tta tca tga

4021 aca tta aat gtg atg atg att tct ttt ccc tgc aca cat ctt tcc ggt gca ata tct atc

4081 aat tgt gaa tct ggc tgc tgg tgt ata aaa acc tgg atg taa agc tga gcc tac aga cct

4141 gtc ctc acc aac tgt ttt gtg att tct act caa cta caa aga ttt att taa tgt act ctt

4201 aat cta act gag ttt tgt tac caa tga cct gtt gca tgc ttc aat acc gtg tac tgc ctg

4261 agt tgt gcc tct tgt gtg cta gat taa aag tga gac aga gac ttg act tga tcc tct gag

4321 ctc aag cta ttg agc tgg tat tgg cax agg act gag ggt acc tgc aca gtt tga ttc ttt

4381 tcc cac gtt gta agt ctc cat tgc aga att gtc gtg ttt gag aaa aca cct gag gca gtg

4441 tgg gag ttg aac gac cct gct gtc cyt ttt aac ctg tgt tgt cct aga ccc tkJc tcg ggg

4501 sca gtc agg gga cac cta gag att tga tct cat gcg ogt cat caa tag gac aaa aaa gtt

4561 gtg gtt tgg gga ggt ctg ttt gtt aca taa aaa gga cct ttc ggt gta aga aat tgc cgt 

4621 ttt tac cct gcc ctg gct ggc atg tga gaa gcc atg gaa ggt tgt ggt tgt aaa tga gtt

4681 gtc taa agg ggt gca gag gcc tga ggt ttc taa aag aag gta gat ttc tac aga gct gag

4741 tgt tgg ttc ctt ttt ctt att ggt tga aaa tta cct ggt agt gat cag aaa act tag atg 
4801 cta tgt aac t
FIGURE 2



Amino Acid Sequence Alignment of Human NED-4
(SwissProt Accession No. P46934) and ZGGBP1
using Clustal method with PAM250 residue weight table
FIGURE 3



Northern Blot Analysis of mRNAs 
Hybridised with ZGGBP1 
FIGURE 4



Nucleic Acid Sequence Alignment of Human and Mouse ZGGBP1 
using Clustal method with Weighted residue weight table 